Chapter 17
More of a Good Thing: Multiple Regression
IN THIS CHAPTER
Understanding what multiple regression is
Preparing your data and interpreting the output
Understanding how interactions and collinearity affect regression analysis
Estimating the number of participants you need for a multiple regression analysis
Chapter 15 introduces the general concepts of correlation and regression, two related techniques for
detecting and characterizing the relationship between two or more variables. Chapter 16 describes the
simplest kind of regression — fitting a straight line to a set of data consisting of one independent
variable (the predictor) and one dependent variable (the outcome). The formula relating the predictor
to the outcome, known as the model, is of the form
Y = a + bX, where Y is the outcome, X is the
predictor, and a and b are parameters (also called regression coefficients). This kind of regression is
usually the only kind you encounter in an introductory statistics course, because it's the simplest form of regression and a natural place for beginners to start.
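To take a tiny worked example: if a = 1.0 and b = 2.0, the model predicts an outcome of Y = 1.0 + 2.0 × 3.0 = 7.0 for a participant whose predictor value is X = 3.0 (numbers made up purely for illustration).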
This chapter extends simple straight-line regression to more than one predictor — to what’s called the
ordinary multiple linear regression model, or more simply, multiple regression.
Understanding the Basics of Multiple Regression
In Chapter 16, we outline the derivation of the formulas for determining the parameters of a straight
line so that the line — defined by an intercept at the Y axis and a slope — comes as close as possible
to all the data points (imagine a scatter plot). The term as close as possible is operationalized as a
least-squares line, meaning we are looking for the line for which the sum of the squares (SSQ) of the vertical distances of the points from the line is smallest. SSQ is smaller for the least-squares line than for any other line you could possibly draw.
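To make the least-squares idea concrete, here's a minimal Python sketch using NumPy (the data values are made up for illustration). It fits the least-squares line and confirms that its SSQ is smaller than that of a slightly different line:

```python
import numpy as np

# Made-up example data: one predictor (x) and one outcome (y)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

def ssq(a, b):
    # Sum of squared vertical distances from the points to the line Y = a + bX
    return np.sum((y - (a + b * x)) ** 2)

# np.polyfit with deg=1 returns the least-squares slope and intercept
b_fit, a_fit = np.polyfit(x, y, deg=1)

print(f"Least-squares line: Y = {a_fit:.3f} + {b_fit:.3f}X, SSQ = {ssq(a_fit, b_fit):.4f}")
# Nudging the slope away from the least-squares value always increases SSQ
print(f"SSQ with a slightly steeper line: {ssq(a_fit, b_fit + 0.1):.4f}")
```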
The same idea can be extended to multiple regression models containing more than one predictor
(which means estimating more than two parameters). For two predictor variables, you’re fitting a plane,
which is a flat sheet. Imagine fitting a set of points to this plane in three dimensions (meaning you’d be
adding a Z axis to your X and Y). Now, extend your imagination: for more than two predictors, you’re fitting a hyperplane to points in four-or-more-dimensional space. Hyperplanes in
multidimensional space may sound mind-blowing, but luckily for us, the actual formulas are simple
algebraic extensions of the straight-line formulas.
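To see that the algebra really is a simple extension, here's a short Python sketch (again with made-up numbers) that fits a two-predictor model of the form Y = a + b1X1 + b2X2 by least squares using NumPy:

```python
import numpy as np

# Made-up data: two predictors (x1, x2) and one outcome (y)
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
y = np.array([6.1, 5.0, 11.2, 10.1, 16.0, 14.9])

# Design matrix: a column of 1s (for the intercept a) plus one column per predictor
X = np.column_stack([np.ones_like(x1), x1, x2])

# Least-squares fit of the plane Y = a + b1*X1 + b2*X2
coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
a, b1, b2 = coefs
print(f"Fitted plane: Y = {a:.3f} + {b1:.3f}*X1 + {b2:.3f}*X2")
```

With three or more predictors, nothing changes except that you add more columns to the design matrix; the same least-squares machinery fits the hyperplane.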
In the following sections, we define some basic terms related to multiple regression, and explain when
you should use it.
Defining a few important terms